Abstract: Understanding the behaviors of information propagation is essential for the effective exploitation of social influence in social
networks. However, few existing influence models are both tractable and efficient for describing the information propagation process and
quantitatively measuring social influence. To this end, in this paper, we develop a linear social influence model, named Circuit due to its close
relation to the circuit network. Based on four predefined axioms of social influence, we first demonstrate that our model can efficiently
measure the influence strength between any pair of nodes. Along this line, an upper bound of the influence of a node (or a set of nodes) is
identified for potential use, e.g., reducing the search space. Furthermore, we provide the physical implication of the Circuit model and a deep
analysis of its relationships with existing methods, such as PageRank. Then, we show that the Circuit model provides a natural solution to the
problems of computing each single node's authority and finding a set of nodes for social influence maximization. Finally, the effectiveness
of the proposed model is evaluated on real-world data. The extensive experimental results demonstrate that the Circuit model consistently
outperforms the state-of-the-art methods and can greatly alleviate the computation burden of the influence maximization problem.

Index Terms — Social Influence Model, Circuit, Influence Spread, Authority, Social Influence Maximization 
1 Introduction

Social networks make connections among individuals. Usually,
in this social paradigm, people tend to connect with their
friends, family members, or colleagues, which makes the
connections in social networks a kind of trust relationship.
Under this relationship, if somebody does something, her friends
tend to believe that it is good or trustworthy. For example,
suppose a man bought a new product and shared his pleasant
experience with it on a social network site; his friends
would then be likely to be influenced by his experience
and may take it as advice when they want to buy a similar
product. This effect is valuable for product marketing and
information propagation, for two obvious reasons. First, a
recommendation from one's friends is more likely to be
accepted. Second, it can trigger a domino effect: if a product
is adopted and shared by someone, her friends may take this as
advice to adopt it as well, then her friends' friends, and so
forth. This is the so-called "word-of-mouth" or "viral
marketing" effect, which has been investigated for a long time
[13], [20], [22]. Both marketers and news communicators wonder
how to take advantage of this effect to improve their work on
social network platforms.

To this end, a preliminary step is to model the influence between
individuals. Influence is the effect that an individual has on
others when they are making decisions or behaving, and its
amount can be viewed as a probability. Roughly speaking,
suppose individual A has tried M things and individual B tried
N of those things following A; then the amount of influence
from A to B is N/M, which ranges between 0 and 1 and can be
viewed as a probability. If we can model the influence between
individuals and quantify it, then we can use it to design
strategies for product marketing or information propagation.
In practice, however, we also need to model the influence of a
group of individuals. For example, if more than one person
shares an experience of a product, you will be affected by
their combined influence. This type of influence is subtle: it
differs both from a single influence and from the sum of those
single influences. Modeling combined influence is very useful.
When marketers design a viral marketing campaign, they usually
select more than one individual to endorse their product, so
the influence a person receives from the campaign is usually
the combined influence of multiple persons.

In recent years, there have been many theoretical and empirical
studies on social influence modeling. Anagnostopoulos
et al. [4] proposed two statistical tests to distinguish social
influence from the multiple sources of correlation (i.e., homophily,
confounding, and influence) between the actions of friends in a
social network. Goyal et al. [14] studied how to learn the
amount of social influence between adjacent individuals.
Granovetter et al. [15] proposed the Linear Threshold (LT)
model to simulate the information propagation process and give
the amount of social influence between any pair of individuals,
while Goldenberg et al. [12] proposed another model, called the
Independent Cascade (IC) model. However, both are operational
models that are intractable and inefficient. Under these
models, there is no closed-form solution for social influence;
to obtain the influence of an individual on the others, one
has to run Monte-Carlo simulations sufficiently many times
(e.g., 20000 times) to obtain an approximate estimate [9],
which is very time-consuming.

To alleviate these obstacles, in our preliminary work [24],
we proposed a circuit-inspired linear model to describe the
influence between individuals. Specifically, we adopted a
two-stage strategy. In the first stage, we proposed a rule-based
definition to model the influence between a pair of single
individuals and obtained a closed-form solution, namely a
probabilistic influence matrix whose (i, j)-th entry is the
strength of influence from node i to node j. In the second stage,
we proposed the concept of "independent influence", with which we
formed the formula for the influence of a group of individuals.
Under this model, social influence is tractable and can be
computed efficiently by a fast Gauss-Seidel iteration method.
In addition, we proposed an upper bound to estimate the total
influence of a node in the network, which helps to identify
the truly influential nodes. Finally, by exploiting the
influence model and using the upper bound to select seeds,
we proposed a novel method to solve the well-known viral
marketing problem. The experimental results demonstrated that
this method outperforms the state-of-the-art algorithms in both
efficiency and effectiveness.

In this paper, we further study the linear model for influence.
Unlike the two-stage strategy adopted in the preliminary study,
we uniformly model the influence of an individual and of a group
of individuals by an axiomatic definition, and then derive its
closed-form expression. Moreover, this expression can be
solved by a fast Gauss-Seidel iteration method. Along this
line, we further find a compact upper bound to estimate the
total influence of an individual or a group of individuals, which
is helpful for evaluating whether an individual or a group of
individuals is influential enough. Using this upper bound, we
solve the viral marketing problem effectively. What is more, we
find that when the upper bound is used as an approximation of
the real influence, the problem can be solved in nearly linear
time, and the experimental results on a variety of networks
demonstrate that this method performs better than the
state-of-the-art algorithms in both efficiency and effectiveness.
In addition, we explore the relationship between our model and
other traditional models (e.g., the independent cascade model [12]
and the stochastic model [3]) both theoretically and empirically.

The rest of the paper is organized as follows. Section 2
presents the related work on influence models and social
influence maximization. In Section 3, we propose an
axiomatic definition to model the influence of a set and then
deduce its closed-form expression, which can be solved by
a fast Gauss-Seidel iteration method. In that section, we also
propose an upper bound to estimate the total influence of a set
on the network. In Section 4, we propose the circuit simulation
of the model and explore the relationship between our model and
other traditional models. In Section 5, we adopt the influence
model to solve the well-known social influence maximization
problem and propose two novel methods, i.e., Circuit-Complete
and Circuit-Fast. In Section 6, we demonstrate three claims of
this paper through experiments: the linear model is closely
related to the traditional models; the upper bound is
consistently close to the real total influence; and
Circuit-Complete and Circuit-Fast outperform the
state-of-the-art algorithms. In Section 7, we conclude this
paper and propose several problems to be solved in the future.

2 Related Work 

Related work can be grouped into two categories. In the first 
category, we describe some existing social influence models. 
The second category includes the existing works for the social 
influence maximization problem. 



Social Influence Models. In the literature, many studies
about social influence have been published. For instance,
Anagnostopoulos et al. [4] proved the existence of social
influence by statistical tests. Also, Goyal et al. [14] studied
how to learn the true probabilities of social influence between
individuals. In addition, there are several models to infer how
the influence propagates through the network. For example,
Granovetter et al. [15] proposed the Linear Threshold (LT)
model to describe it, while Goldenberg et al. [12] proposed
the Independent Cascade (IC) model. Since these two models
are not tractable, Kimura et al. [17] proposed a comparably
tractable model, SPM, and Aggarwal et al. [3] proposed a
stochastic model to address this issue. Recently, Easley et
al. [11] and Aggarwal et al. [2] summarized and generalized
many existing studies on social influence and some other
research aspects of social networks. More importantly, they
demonstrated that, through careful study, the information
exploited from social influence can be leveraged for dealing
with real-world problems (e.g., problems from markets or social
security) effectively and efficiently.

Social Influence Maximization. As an application, there
is an important research branch that exploits social influence
for marketing. It is called viral marketing and aims at
finding a small set of "influential" individuals in the network
(called "seeds") and giving them free samples of a product, in
order to trigger a cascade of influence in which friends
recommend the product to other friends, in the hope that the
product will be adopted by a large fraction of the network.

Many studies have aimed at solving this problem; here we
list a representative part of them. Domingos and Richardson
were the first to propose this problem [10], [22]. Kempe et al.
formulated it as a discrete optimization problem, proved that
the optimization problem is NP-hard, and presented a greedy
approximation (GA) algorithm which guarantees that the
influence spread is within (1 - 1/e) of the optimal result.
To address the efficiency issue, Leskovec et al. [19] presented
a "Lazy Forward" scheme (called CELF optimization), which
takes advantage of the submodularity of the influence
maximization objective to reduce the number of evaluations of
the influence spread of individuals. To address the scalability
issue, Chen et al. proposed several heuristic methods,
including DegreeDiscountIC [9] and PMIA [8], the latter of which
uses local arborescence structures of each individual to
approximate the social influence propagation. Wang et al. [23]
presented a community-based greedy algorithm to find the
top-K influential nodes. They first detect the communities in
the social network and then find influential nodes from the
selected potential communities.

3 Social Influence Modeling 

Social influence refers to the behavioral change of individuals
affected by others in a network. Social influence is an intuitive
and well-accepted phenomenon in social networks [11]. Here,
we provide a quantitative way to measure social influence.
To facilitate the following discussion, we list the
important math notations used in this paper in Table 1.
TABLE 1
Math Notations.

Notation | Description
$f_{S \to T}$ | The influence from a group of individuals $S$ to another group of individuals $T$, where $S$ and $T$ could each be a single individual.
$\mathbf{f}_S = [f_{S \to 1}, f_{S \to 2}, \ldots, f_{S \to n}]'$ | The influence vector from $S$ to each member of $V$, where $S$ could be a single individual.
$\mathbf{F} = [f_{ij}]_{n \times n}$ | Influence matrix, where $f_{ij}$ is the influence from $i$ to $j$ and is equal to $f_{i \to j}$.
$\mathbf{T} = [t_{ij}]_{n \times n}$ | Transmission matrix, where $t_{ij}$ is the transmission probability from $i$ to $j$.
$\mathbf{\Lambda} = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$ | Damping coefficient matrix, where $\lambda_i$ is the damping coefficient of $i$.
$\mathbf{P} = [p_{ij}]_{n \times n} = (\mathbf{I} + \mathbf{\Lambda} - \mathbf{T}')^{-1}$ | Basis matrix, where $p_{ij}$ is the basis element used to compound influence.
$\mathbf{p}_T = [p_{1 \to T}, p_{2 \to T}, \ldots, p_{n \to T}]'$ | Potential vector, where $p_{i \to T} = \sum_{j \in T} p_{ji}$.
$b_{S \to T}$ | The upper bound of $f_{S \to T}$, equal to $\sum_{j \in S} \big((1 + \lambda_j) - \sum_{k \in S} t_{kj}\big) p_{j \to T}$.
$\mathbf{\Theta} = \mathrm{diag}(\theta_1, \theta_2, \ldots, \theta_n)$ | $\theta_i = \sum_{j=1}^{n} t_{ji}$ is the total transmission probability flowing into individual $i$.



Let $G = (V, E)$ be a social network, where the node set
$V = \{1, 2, \ldots, n\}$ includes all individuals and the edge set $E$
represents all the social connections, which can be viewed
as trust relationships. We denote the influence from a group
of individuals $S$ to another group of individuals $T$ as $f_{S \to T}$,
where $S$ and $T$ are subsets of $V$. When $S$ and $T$ are sets with
only one element, $f_{S \to T}$ is the influence between a pair of single
individuals. Under this notation, the design of a viral marketing
campaign can be formulated as the following optimization
problem:

$$S^{*} = \arg\max_{S \subseteq V} f_{S \to V} \quad \text{subject to } |S| = K \tag{1}$$

In this paper, we propose four axioms to model the general
influence $f_{S \to T}$ as follows.

Axiom 1: The influence from $S$ to $T$ is equal to the sum
of the influences from $S$ to each member of $T$, that is

$$f_{S \to T} = \sum_{i \in T} f_{S \to i} \tag{2}$$

Axiom 2: If $i$ is a member of $S$, the influence from $S$ to $i$
should always be equal to 1, that is

$$f_{S \to i} = 1 \quad \text{for } i \in S. \tag{3}$$

Axiom 3: Influence could transmit through the trust con- 
nection in network with a certain transmission probability on 
it. 

Axiom 4: The influence to an arbitrary individual is determined
by the influences to her trust-friends. Suppose the set of $j$'s
trust-friends is $N_j = \{j_1, j_2, \ldots, j_m\}$ (i.e., $\forall k \in N_j$, there is a trust
connection $(j, k) \in E$) and the influence to $k \in N_j$ is $f_{S \to k}$, then

$$f_{S \to j} = f_j\big(t_{j_1 j} f_{S \to j_1}, t_{j_2 j} f_{S \to j_2}, \ldots, t_{j_m j} f_{S \to j_m}\big) \quad \text{for } j \notin S \tag{4}$$

where $f_j(\cdot)$ is a combination function for $j$ and $t_{kj}$ is the
transmission probability on trust connection $(j, k)$ (see footnote 1).

Based on the above four axioms, there are two factors which 
will determine the shape of the social influence model. 

The first factor is the transmission probabilities on each trust 
connection. In this paper, we use an assumption to confine the 
probability, that is 



Assumption 1: The sum of transmission probabilities flowing
into one node should be less than or equal to 1. That is,

$$\theta_i = \sum_{j=1}^{n} t_{ji} \le 1 \quad \text{for } i = 1, 2, \ldots, n$$

where $t_{ji}$ is the transmission probability from node $j$ to node
$i$. If $(i, j) \notin E$, then $t_{ji} = 0$.

This assumption is used for measuring the amount
of information (e.g., with regard to an event or message) that
will be accepted by each node. The corresponding value varies
in the range [0, 1], where 0 stands for ignorance of the
information and 1 means that the node totally believes in it.

The second factor is the way in which an individual combines
the influences received from her trust-friends. For instance,
Aggarwal et al. [2] proposed a way to describe this function,
that is

$$f_{S \to j} = 1 - \prod_{k \in N_j} (1 - t_{kj} f_{S \to k}) \tag{5}$$

which claims that the transmitted influences from different
friends should be independent of each other. This is a
theoretically reasonable way; however, it is too complex to
admit a closed-form solution. Thus, in this paper, we propose a
linear way. That is,

$$f_{S \to j} = \frac{1}{1 + \lambda_j} \sum_{k \in N_j} t_{kj} f_{S \to k} \quad \text{for } j \notin S \tag{6}$$

where $\lambda_j$ is the damping coefficient of $j$ for the influence
propagation. It lies in the range $(0, +\infty)$. The smaller $\lambda_j$ is (i.e.,
the closer to 0), the less the information will be blocked
by node $j$. In real applications, this number may also vary
with the topic of the propagating information. For instance,
if node $j$ favors the topic of the propagating information, $\lambda_j$
will approach 0; otherwise, it will approach a large positive
number, even $+\infty$.
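
To make the difference between the two combination functions concrete, the following is a minimal sketch, not the authors' implementation. The function names and the two incoming values are hypothetical; each entry of `incoming` stands for one product $t_{kj} f_{S \to k}$ arriving at node $j$.

```python
from math import prod


def combine_independent(weighted):
    """Eq. (5): 1 - prod(1 - t_kj * f_{S->k}) over the trust-friends of j."""
    return 1.0 - prod(1.0 - w for w in weighted)


def combine_linear(weighted, damping):
    """Eq. (6): damped sum of the transmitted influences."""
    return sum(weighted) / (1.0 + damping)


# Hypothetical inputs: two trust-friends contributing 0.3 and 0.1.
incoming = [0.3 * 1.0, 0.2 * 0.5]
independent = combine_independent(incoming)
linear_small_damping = combine_linear(incoming, damping=0.1)
linear_large_damping = combine_linear(incoming, damping=10.0)
print(independent, linear_small_damping, linear_large_damping)
```

With a small damping coefficient the linear value stays close to the independent one, while a large damping coefficient blocks most of the incoming influence, which matches the role of $\lambda_j$ described above.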

3.1 The Deduction of Influence 

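Because Equation (6) only describes the individuals not in $S$, we first
reformulate it to describe all individuals, including the members of
$S$, as follows.

$$f_{S \to i} = \frac{1}{1 + \lambda_i} \Big( \sum_{j=1}^{n} t_{ji} f_{S \to j} + v_{S,i} \Big) \quad \text{for } i = 1, 2, \ldots, n \tag{7}$$

where $v_{S,i}$ is a correction that guarantees that $f_{S \to i}$ is equal to 1 if
$i$ is a member of $S$ and to $\frac{1}{1 + \lambda_i} \sum_{k=1}^{n} t_{ki} f_{S \to k}$ otherwise (Axiom 2
and Axiom 4). Thus, the value of $v_{S,i}$ is determined as

$$v_{S,i} = \begin{cases} 0 & i \notin S \\ \text{a number to guarantee } f_{S \to i} = 1 & i \in S \end{cases} \tag{8}$$

Equation (7) for $i = 1, 2, \ldots, n$ can be rewritten as

$$\mathbf{f}_S = (\mathbf{I} + \mathbf{\Lambda})^{-1} (\mathbf{T}' \mathbf{f}_S + \mathbf{v}_S)$$

where

$$\mathbf{f}_S = [f_{S \to 1}, f_{S \to 2}, \ldots, f_{S \to n}]'$$
$$\mathbf{T} = [t_{ij}]_{n \times n}$$
$$\mathbf{v}_S = [v_{S,1}, v_{S,2}, \ldots, v_{S,n}]'$$
$$\mathbf{\Lambda} = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$$

which can be solved as

$$\mathbf{f}_S = (\mathbf{I} + \mathbf{\Lambda} - \mathbf{T}')^{-1} \mathbf{v}_S \tag{9}$$
$$\phantom{\mathbf{f}_S} = \mathbf{P} \mathbf{v}_S \tag{10}$$

where the transpose of $(\mathbf{I} + \mathbf{\Lambda} - \mathbf{T}')$ is strictly diagonally
dominant, thus it is invertible. In this paper, we denote the
inverse of $(\mathbf{I} + \mathbf{\Lambda} - \mathbf{T}')$ as $\mathbf{P} = [p_{ij}]_{n \times n}$ and call it the basis
matrix.

1. Notably, in social networks, the direction of a trust connection is inverse to
the direction of influence, which means that if $i$ trusts $j$, then $j$ will influence $i$.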

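Based on Axiom 2 (the influence from $S$ to each member of
$S$ should be 1) and Equation (8), from Equation (10) we get

$$f_{S \to i} = \sum_{j \in S} p_{ij} v_{S,j} = 1 \quad \text{for } i \in S \tag{11}$$

Suppose $S = \{s_1, s_2, \ldots, s_K\}$, where $K$ is the cardinality of $S$, and
without loss of generality assume $s_1 < s_2 < \ldots < s_K$. Denote
$\mathbf{v}_{SS} = [v_{S,s_1}, v_{S,s_2}, \ldots, v_{S,s_K}]'$, and denote by $\mathbf{P}_{SS}$ the
matrix cut down from $\mathbf{P}$ by removing the columns
and rows not corresponding to members of $S$. Equation (11)
can then be rewritten as

$$\mathbf{P}_{SS} \mathbf{v}_{SS} = \mathbf{e}$$

where $\mathbf{e}$ is a $|S|$-dimensional vector of all 1s. Thus,

$$\mathbf{v}_{SS} = \mathbf{P}_{SS}^{-1} \mathbf{e}$$

and

$$v_{S,s_k} = [\mathbf{P}_{SS}^{-1} \mathbf{e}]_k \quad \text{for } k = 1, 2, \ldots, K \tag{12}$$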

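In conclusion, we can state the closed-form solution of $\mathbf{f}_S$ as
follows.

Theorem 1: In a network $G(V, E)$, given the transmission
matrix $\mathbf{T}$ and the information's damping coefficient matrix $\mathbf{\Lambda}$, the
influence vector from a set $S = \{s_1, s_2, \ldots, s_K\} \subseteq V$ (assuming
$s_1 < s_2 < \ldots < s_K$) to the members of the network is

$$\mathbf{f}_S = (\mathbf{I} + \mathbf{\Lambda} - \mathbf{T}')^{-1} \mathbf{v}_S = \mathbf{P} \mathbf{v}_S \tag{13}$$

where $\mathbf{v}_S = [v_{S,1}, v_{S,2}, \ldots, v_{S,n}]'$ and

$$v_{S,i} = \begin{cases} 0 & i \notin S \\ [\mathbf{P}_{SS}^{-1} \mathbf{e}]_k & i = s_k \in S \end{cases} \tag{14}$$

where $\mathbf{P}_{SS}$ is the matrix cut down from $\mathbf{P}$ by
removing the columns and rows not corresponding to members
of $S$.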

In another form,

$$\mathbf{f}_S = \sum_{i \in S} v_{S,i} \mathbf{P}_{\cdot i} \tag{15}$$

since $v_{S,i}$ is equal to 0 if $i \notin S$. From this equation, we
can observe that $\mathbf{f}_S$ is actually a linear combination of the
columns of $\mathbf{P}$, which is the reason why we call $\mathbf{P}$ the basis matrix.
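
Theorem 1 can be checked on a toy network. The sketch below is illustrative only: it assumes zero-based node indices, a dense list-of-lists `T` with `T[i][j]` $= t_{ij}$ and $t_{ii} = 0$, and a small hand-written Gauss-Jordan helper `invert` in place of a linear-algebra library. The 3-node matrix and the damping values are made-up inputs that satisfy Assumption 1.

```python
def invert(a):
    """Invert a small dense matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    aug = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def basis_matrix(T, lam):
    """P = (I + Lambda - T')^{-1}."""
    n = len(T)
    gamma = [[(1.0 + lam[i] if i == j else 0.0) - T[j][i] for j in range(n)]
             for i in range(n)]
    return invert(gamma)


def influence_vector(T, lam, seeds):
    """f_S = sum over i in S of v_{S,i} * P[:, i], with v_SS = P_SS^{-1} e."""
    P = basis_matrix(T, lam)
    S = sorted(seeds)
    P_SS = [[P[a][b] for b in S] for a in S]
    v_SS = [sum(row) for row in invert(P_SS)]  # row sums give P_SS^{-1} e
    return [sum(v * P[j][s] for v, s in zip(v_SS, S)) for j in range(len(T))]


# Hypothetical example: column sums of T are 0.6, 0.7, 0.6 (all <= 1).
T = [[0.0, 0.5, 0.2],
     [0.3, 0.0, 0.4],
     [0.3, 0.2, 0.0]]
lam = [0.1, 0.1, 0.1]
f = influence_vector(T, lam, [0, 2])
print(f)
```

In this example the seeds 0 and 2 receive influence 1 (Axiom 2), and the only non-seed node receives $(t_{01} + t_{21}) / (1 + \lambda_1)$, as Equation (6) requires.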
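Specifically, when $S$ contains only one element, say $i$,

$$\mathbf{f}_i = v_{i,i} \mathbf{P}_{\cdot i} \tag{16}$$

and thus

$$f_{i \to j} = v_{i,i} p_{ji} \tag{17}$$

For $|S| = 1$, $\mathbf{P}_{SS}$ is a $1 \times 1$ matrix equal to $[p_{ii}]$. Based on
Equation (14), we easily get

$$v_{i,i} = \frac{1}{p_{ii}} \tag{18}$$

Equation (16) for $i = 1, 2, \ldots, n$ can be rewritten as

$$\mathbf{F} \triangleq [\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_n]' = \Big[\frac{1}{p_{11}} \mathbf{P}_{\cdot 1}, \frac{1}{p_{22}} \mathbf{P}_{\cdot 2}, \ldots, \frac{1}{p_{nn}} \mathbf{P}_{\cdot n}\Big]' = \mathrm{diag}(\mathbf{P})^{-1} \mathbf{P}' \tag{19}$$

where the $(i, j)$-entry of $\mathbf{F}$ is the influence from $i$ to $j$. Thus,
we call $\mathbf{F} = [f_{ij}]_{n \times n} = [p_{ji} / p_{ii}]_{n \times n}$ the influence matrix of $G$. $\mathbf{F}$
gives the influences between all pairs of individuals. Given
$\mathbf{F}$, if we want to know the influence from $i$ to $j$, we only need to
look up the value at the $(i, j)$-entry of $\mathbf{F}$, that is, $f_{i \to j} = f_{ij} = p_{ji} / p_{ii}$.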

3.2 The Computation of $\mathbf{f}_S$

It seems that, to compute the influence vector $\mathbf{f}_S$, one has to
compute two inverse matrices, $(\mathbf{I} + \mathbf{\Lambda} - \mathbf{T}')^{-1}$ and $\mathbf{P}_{SS}^{-1}$, so that the
time complexity of this computation would be $O(n^3)$. Fortunately,
based on Equation (15), we only need to compute the
columns of $\mathbf{P}$ corresponding to the members of $S$. Moreover,
because the transpose of $(\mathbf{I} + \mathbf{\Lambda} - \mathbf{T}')$ is a strictly diagonally
dominant matrix, it satisfies the convergence condition of the
Gauss-Seidel method, so its inverse can be computed very
quickly through a Gauss-Seidel iteration process.
Because $\mathbf{P}$ is the inverse of $(\mathbf{I} + \mathbf{\Lambda} - \mathbf{T}')$, we have

$$(\mathbf{I} + \mathbf{\Lambda} - \mathbf{T}') \mathbf{P}_{\cdot i} = \mathbf{e}_i,$$

where the entries of $\mathbf{P}_{\cdot i}$ can be viewed as the variables of this linear system
of equations and $\mathbf{e}_i$ is the $i$-th unit vector, with entries $e_{ji}$. Since the
transpose of $(\mathbf{I} + \mathbf{\Lambda} - \mathbf{T}')$ is strictly diagonally
dominant, $\mathbf{P}_{\cdot i}$ can be solved by the Gauss-Seidel method.
Specifically, the Gauss-Seidel method is an iterative method that
operates as follows:
1. Set $p^{(0)}_{ji} = 0$ for $j = 1, 2, \ldots, n$;

2. $p^{(k+1)}_{ji} = \frac{1}{1 + \lambda_j} \big( e_{ji} + \sum_{l > j} t_{lj} p^{(k)}_{li} + \sum_{l < j} t_{lj} p^{(k+1)}_{li} \big)$ for $j = 1, 2, \ldots, n$;

3. Repeat Step 2 until the changes made by an iteration
are below a certain tolerance.
This procedure is efficient. To get $\mathbf{P}_{\cdot i}$ within a valid tolerance
range, it often needs only dozens of iterations. Thus, the
time complexity of computing $|S|$ columns of $\mathbf{P}$ is $O(|S||E|)$.
Notably, $S$ is often a set with a small number of elements, so
the computation of $\mathbf{P}_{SS}^{-1}$ only consumes constant time.
Additionally, in the following sections, we will propose a
method to compute $\mathbf{f}_S$ in $O(|E|)$ time no matter how large the
cardinality of $S$ is.
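
The three Gauss-Seidel steps of Section 3.2 can be sketched as follows. This is a minimal illustration under stated assumptions: zero-based node indices, a dense `T` with $t_{jj} = 0$, and hypothetical function names, tolerance, and sweep limit. It solves for one column of the basis matrix and then normalises it by $p_{ii}$ to obtain the single-seed influences $f_{i \to j} = p_{ji} / p_{ii}$.

```python
def basis_column(T, lam, i, tol=1e-12, max_sweeps=1000):
    """Solve (I + Lambda - T') P[:, i] = e_i by Gauss-Seidel sweeps."""
    n = len(T)
    # in_edges[j] lists (l, t_lj) for every node l that transmits to j,
    # so one sweep touches each edge once.
    in_edges = [[(l, T[l][j]) for l in range(n) if l != j and T[l][j] > 0.0]
                for j in range(n)]
    p = [0.0] * n                      # step 1: start from zero
    for _ in range(max_sweeps):
        change = 0.0
        for j in range(n):             # step 2: in-place update reuses fresh values
            rhs = (1.0 if j == i else 0.0) + sum(t * p[l] for l, t in in_edges[j])
            new = rhs / (1.0 + lam[j])
            change = max(change, abs(new - p[j]))
            p[j] = new
        if change < tol:               # step 3: stop once a sweep barely moves
            break
    return p


def pairwise_influence(T, lam, i):
    """f_{i->j} = p_ji / p_ii for all j."""
    col = basis_column(T, lam, i)
    return [x / col[i] for x in col]


# Hypothetical 3-node example satisfying Assumption 1.
T = [[0.0, 0.5, 0.2],
     [0.3, 0.0, 0.4],
     [0.3, 0.2, 0.0]]
lam = [0.1, 0.1, 0.1]
f0 = pairwise_influence(T, lam, 0)
print(f0)
```

Keeping the incoming edges in adjacency lists is what makes one sweep cost $O(|E|)$; the dense `T` here is only a convenience for building those lists in a small example.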

3.3 An Upper Bound Of $f_{S \to T}$

Based on Axiom 1, we know that $f_{S \to T}$ is equal to the sum
of the influences from $S$ to each member of $T$. If we define

$$p_{i \to T} = \sum_{j \in T} p_{ji} \tag{20}$$

we get

Theorem 2: The amount of influence from a group of
individuals $S$ to another group of individuals $T$ has an upper
bound, that is

$$f_{S \to T} \le \sum_{i \in S} \Big( (1 + \lambda_i) - \sum_{k \in S} t_{ki} \Big) p_{i \to T} \tag{21}$$

To prove this theorem, let us first prove a lemma about the
correction vector $\mathbf{v}_S$.

Lemma 1: The correction vector $\mathbf{v}_S$ satisfies

$$v_{S,j} \le (1 + \lambda_j) - \sum_{k \in S} t_{kj} \quad \text{for } j \in S \tag{22}$$



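Proof: First, let us denote

$$\mathbf{\Gamma} = (\mathbf{I} + \mathbf{\Lambda} - \mathbf{T}') = \begin{bmatrix} \mathbf{\Gamma}_{SS} & \mathbf{\Gamma}_{S\bar{S}} \\ \mathbf{\Gamma}_{\bar{S}S} & \mathbf{\Gamma}_{\bar{S}\bar{S}} \end{bmatrix}$$

where we rearrange and divide $\mathbf{\Gamma} = (\mathbf{I} + \mathbf{\Lambda} - \mathbf{T}')$ into four
submatrices based on whether or not the individual corresponding to
a row or column is a member of the set $S$. From linear
algebra, we have

$$\mathbf{P} = \begin{bmatrix} \mathbf{P}_{SS} & \mathbf{P}_{S\bar{S}} \\ \mathbf{P}_{\bar{S}S} & \mathbf{P}_{\bar{S}\bar{S}} \end{bmatrix} = \mathbf{\Gamma}^{-1} = \begin{bmatrix} \mathbf{M} & -\mathbf{M} \mathbf{\Gamma}_{S\bar{S}} \mathbf{\Gamma}_{\bar{S}\bar{S}}^{-1} \\ -\mathbf{\Gamma}_{\bar{S}\bar{S}}^{-1} \mathbf{\Gamma}_{\bar{S}S} \mathbf{M} & \mathbf{\Gamma}_{\bar{S}\bar{S}}^{-1} + \mathbf{\Gamma}_{\bar{S}\bar{S}}^{-1} \mathbf{\Gamma}_{\bar{S}S} \mathbf{M} \mathbf{\Gamma}_{S\bar{S}} \mathbf{\Gamma}_{\bar{S}\bar{S}}^{-1} \end{bmatrix}$$

where

$$\mathbf{M} = (\mathbf{\Gamma}_{SS} - \mathbf{\Gamma}_{S\bar{S}} \mathbf{\Gamma}_{\bar{S}\bar{S}}^{-1} \mathbf{\Gamma}_{\bar{S}S})^{-1}$$

Thus, $\mathbf{P}_{SS} = \mathbf{M}$ and

$$\mathbf{v}_{SS} = \mathbf{P}_{SS}^{-1} \mathbf{e} = \mathbf{\Gamma}_{SS} \mathbf{e} - \mathbf{\Gamma}_{S\bar{S}} \mathbf{\Gamma}_{\bar{S}\bar{S}}^{-1} \mathbf{\Gamma}_{\bar{S}S} \mathbf{e}.$$

Because $\mathbf{\Gamma}_{S\bar{S}} \mathbf{\Gamma}_{\bar{S}\bar{S}}^{-1} \mathbf{\Gamma}_{\bar{S}S}$ is a nonnegative matrix (see footnote 2), we get

$$\mathbf{v}_{SS} \le \mathbf{\Gamma}_{SS} \mathbf{e}$$

From this inequality, we get, when $j \in S$,

$$v_{S,j} \le (1 + \lambda_j) - \sum_{k \in S} t_{kj}$$

2. Here $\mathbf{\Gamma}_{\bar{S}\bar{S}}$ is a strictly diagonally dominant matrix, thus its inverse
(denoted as $\mathbf{N} = [n_{lm}]$) is a nonnegative matrix. Let us denote $\mathbf{K} = \mathbf{\Gamma}_{S\bar{S}} \mathbf{N} \mathbf{\Gamma}_{\bar{S}S} = [k_{ij}]$;
then $k_{ij} = \sum_{l \notin S} \sum_{m \notin S} (\gamma_{il} n_{lm} \gamma_{mj})$. Because $\gamma_{il} = -t_{li} \le 0$, $\gamma_{mj} = -t_{jm} \le 0$,
and $n_{lm} \ge 0$, we have $k_{ij} \ge 0$. Thus, $\mathbf{K} = \mathbf{\Gamma}_{S\bar{S}} \mathbf{\Gamma}_{\bar{S}\bar{S}}^{-1} \mathbf{\Gamma}_{\bar{S}S}$ is a nonnegative
matrix.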


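Then, from Equation (13) and based on Lemma 1, we have

$$\mathbf{f}_S = \sum_{i \in S} v_{S,i} \mathbf{P}_{\cdot i} \le \sum_{i \in S} \Big( (1 + \lambda_i) - \sum_{k \in S} t_{ki} \Big) \mathbf{P}_{\cdot i}$$

Thus, $f_{S \to j} \le \sum_{i \in S} \big( (1 + \lambda_i) - \sum_{k \in S} t_{ki} \big) p_{ji}$, and then

$$f_{S \to T} = \sum_{j \in T} f_{S \to j} \le \sum_{j \in T} \sum_{i \in S} \Big( (1 + \lambda_i) - \sum_{k \in S} t_{ki} \Big) p_{ji} = \sum_{i \in S} \Big( (1 + \lambda_i) - \sum_{k \in S} t_{ki} \Big) p_{i \to T}$$

Thus, Theorem 2 is proved. □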

Discussion. Let us denote $\mathbf{p}_T = [p_{1 \to T}, p_{2 \to T}, \ldots, p_{n \to T}]'$. $\mathbf{p}_T$
is a quantity that can be computed in $O(|E|)$ time. Because

$$p_{i \to T} = \sum_{j \in T} p_{ji} = \mathbf{P}_{\cdot i}' \mathbf{e}_T \quad \text{for } i = 1, 2, \ldots, n$$

where $\mathbf{e}_T = [e_1, e_2, \ldots, e_n]'$ and $e_i$ is equal to 1 if $i$ is a member of
$T$ and 0 otherwise, we have

$$\mathbf{p}_T = \mathbf{P}' \mathbf{e}_T.$$

Then,

$$(\mathbf{P}')^{-1} \mathbf{p}_T = (\mathbf{I} + \mathbf{\Lambda} - \mathbf{T}) \mathbf{p}_T = \mathbf{e}_T$$

which is a linear system of equations in the variable $\mathbf{p}_T$ and
can be solved by the Gauss-Seidel method, since $(\mathbf{I} + \mathbf{\Lambda} - \mathbf{T})$ is a
strictly diagonally dominant matrix. Thus, we can compute
$\mathbf{p}_T$ in $O(|E|)$ time by a procedure similar to the one in Section 3.2.
If we spend $O(|E|)$ time to get $\mathbf{p}_T$ first, then the upper
bound of the influence from $S$ to $T$ is a number that can be
obtained instantly. Moreover, because the upper bound proposed
in Theorem 2 is consistently close to the real
$f_{S \to T}$ (see footnote 3), in this paper we denote

$$b_{S \to T} = \sum_{j \in S} \Big( (1 + \lambda_j) - \sum_{k \in S} t_{kj} \Big) p_{j \to T} \tag{23}$$

and often use it as a substitute for the real $f_{S \to T}$ if necessary.

As a consequence of Theorem 2, we get the following
important corollary.

Corollary 1:

$$f_{i \to T} \le (1 + \lambda_i) p_{i \to T} = b_{i \to T} \tag{24}$$
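
The bound of Theorem 2 and Corollary 1 can be evaluated with one linear solve for the potential vector. The sketch below is illustrative and rests on assumptions: zero-based node indices, a dense `T` with $t_{ii} = 0$, hypothetical function names, and a made-up 3-node example. It solves $(\mathbf{I} + \mathbf{\Lambda} - \mathbf{T}) \mathbf{p}_T = \mathbf{e}_T$ by Gauss-Seidel sweeps and then applies Equation (23).

```python
def potential_vector(T, lam, target, tol=1e-12, max_sweeps=1000):
    """Solve (I + Lambda - T) p_T = e_T by Gauss-Seidel sweeps."""
    n = len(T)
    # Row i of (I + Lambda - T) couples i to every j with t_ij > 0.
    out_edges = [[(j, T[i][j]) for j in range(n) if j != i and T[i][j] > 0.0]
                 for i in range(n)]
    p = [0.0] * n
    for _ in range(max_sweeps):
        change = 0.0
        for i in range(n):
            rhs = (1.0 if i in target else 0.0) + sum(t * p[j] for j, t in out_edges[i])
            new = rhs / (1.0 + lam[i])
            change = max(change, abs(new - p[i]))
            p[i] = new
        if change < tol:
            break
    return p


def upper_bound(T, lam, seeds, target):
    """b_{S->T} from Eq. (23)."""
    p_T = potential_vector(T, lam, target)
    return sum(((1.0 + lam[j]) - sum(T[k][j] for k in seeds)) * p_T[j]
               for j in seeds)


# Hypothetical 3-node example satisfying Assumption 1.
T = [[0.0, 0.5, 0.2],
     [0.3, 0.0, 0.4],
     [0.3, 0.2, 0.0]]
lam = [0.1, 0.1, 0.1]
everyone = {0, 1, 2}
bound_pair = upper_bound(T, lam, {0, 2}, everyone)
bound_single = upper_bound(T, lam, {0}, everyone)
print(bound_pair, bound_single)
```

For this small example the exact values can be worked out by hand from Equation (6): $f_{\{0,2\} \to V} = 2 + 0.7/1.1$ and $f_{0 \to V} = 1 + 0.59/1.13 + 0.42/1.13$. Both lie below the computed bounds. How tight the bound is on realistic networks is an empirical question that this toy case does not address.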



4 Deep Understandings 

4.1 Another Deduction for Influence And A Physical 
Implication 

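In Section 3.1, we proposed a way to rewrite the formula of
influence and obtained its closed-form expression. In this section,
we propose another way to rewrite that formula and obtain
another closed-form expression of it. In essence, the two
expressions are equivalent to each other.
Equation (6) can be rewritten as

$$f_{S \to j} = \frac{1}{1 + \lambda_j} \sum_{k \notin S} t_{kj} f_{S \to k} + \frac{1}{1 + \lambda_j} \sum_{k \in S} t_{kj} \quad \text{for } j \notin S$$

since $f_{S \to k} = 1$ if $k \in S$ (Axiom 2). This equation is equivalent
to

$$(1 + \lambda_j) f_{S \to j} - \sum_{k \notin S} t_{kj} f_{S \to k} = \sum_{k \in S} t_{kj} \quad \text{for } j \notin S$$

which can be rewritten as

$$(\mathbf{I} + \mathbf{\Lambda}_{\bar{S}\bar{S}} - \mathbf{T}'_{\bar{S}\bar{S}}) \mathbf{f}_{\bar{S}} = \mathbf{t}_{\bar{S}} \tag{25}$$

where $\mathbf{\Lambda}_{\bar{S}\bar{S}}$ and $\mathbf{T}_{\bar{S}\bar{S}}$ are the matrices cut down from $\mathbf{\Lambda}$ and
$\mathbf{T}$, respectively, by removing the columns and rows corresponding to the
members of $S$, $\mathbf{f}_{\bar{S}}$ is the vector cut down from $\mathbf{f}_S$
by removing the entries corresponding to the members of $S$,
and

$$\mathbf{t}_{\bar{S}} = \Big[ \sum_{k \in S} t_{ki} \Big]_{(n - |S|) \times 1} \quad \text{for } i \notin S$$

Since $(\mathbf{I} + \mathbf{\Lambda}_{\bar{S}\bar{S}} - \mathbf{T}'_{\bar{S}\bar{S}})$ is still a strictly diagonally dominant matrix,
this linear equation system can be solved by the Gauss-Seidel
method in $O(|E|)$ time. That means we can get $\mathbf{f}_{\bar{S}}$ (and
thus $\mathbf{f}_S$ as well) in $O(|E|)$ time.

3. The closeness of the upper bound to the real influence will be verified in the
experimental part of this paper.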
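
Equation (25) translates directly into an iteration in which only the non-seed entries are unknowns. The following is a minimal sketch under stated assumptions: zero-based node indices, an adjacency-list input `in_edges` where `in_edges[j]` holds pairs `(k, t_kj)`, and hypothetical names, tolerance, and example values.

```python
def influence_from_seeds(in_edges, lam, seeds, tol=1e-12, max_sweeps=1000):
    """Gauss-Seidel on Eq. (25): seeds stay at 1, other entries are solved.

    One sweep visits every incoming edge of every non-seed node once,
    so its cost is O(|E|) regardless of how many seeds there are.
    """
    n = len(in_edges)
    f = [1.0 if j in seeds else 0.0 for j in range(n)]  # Axiom 2 fixes seeds
    free = [j for j in range(n) if j not in seeds]
    for _ in range(max_sweeps):
        change = 0.0
        for j in free:
            # Seed neighbours contribute t_kj * 1 (the right-hand side of
            # Eq. (25)); non-seed neighbours contribute t_kj * f_k.
            new = sum(t * f[k] for k, t in in_edges[j]) / (1.0 + lam[j])
            change = max(change, abs(new - f[j]))
            f[j] = new
        if change < tol:
            break
    return f


# Hypothetical 3-node example: in_edges[j] = [(k, t_kj), ...].
in_edges = [
    [(1, 0.3), (2, 0.3)],
    [(0, 0.5), (2, 0.2)],
    [(0, 0.2), (1, 0.4)],
]
lam = [0.1, 0.1, 0.1]
f_single = influence_from_seeds(in_edges, lam, {0})
f_pair = influence_from_seeds(in_edges, lam, {0, 2})
print(f_single, f_pair)
```

Compared with the route through the basis matrix, this form needs no $\mathbf{P}_{SS}^{-1}$ and no per-seed column solve, which is why the cost does not grow with $|S|$.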

Interestingly, this deduction of influence has a circuit
implication for an undirected network $G$. Suppose we construct the
circuit network as follows:

• First, construct a topologically isomorphic circuit network
of $G$, where the conductance between $i$ and $j$ is equal
to the weight $c_{ij}$ of the trust relationship $(i, j)$ (if $(i, j)$ does
not exist, $c_{ij} = 0$), and guarantee that $\frac{c_{ij}}{d_i} = \frac{t_{ji}}{\theta_i}$, where
$d_i = \sum_{j=1}^{n} c_{ij}$.

• Second, connect each $i \notin S$ with an external electrode $E_i$
through an additional electric conductor with conductance
$\frac{1 + \lambda_i - \theta_i}{\theta_i} d_i$. The electric potential value on $E_i$ is always 0.

• Third, put an electrode pole on each $j \in S$ with potential
value 1.

Then the potential values on the circuit network (illustrated in
Figure 1) are equal to the social influence $\mathbf{f}_S$. This can
be verified quite easily: for each member $i$ of $S$, because there
is an electrode pole with potential value 1 on it, its potential
value is always 1, which is equal to $f_{S \to i}$; for $i \notin S$, based
on the Kirchhoff equations [18], the net current $I_i$ into node $i$ satisfies

$$I_i = \sum_{j=1}^{n} c_{ij} (u_j - u_i) + \frac{1 + \lambda_i - \theta_i}{\theta_i} d_i (0 - u_i) = 0 \quad \text{for } i \notin S \tag{26}$$

and this equation can be rearranged as

$$u_i = \frac{1}{1 + \lambda_i} \sum_{j=1}^{n} \frac{\theta_i c_{ij}}{d_i} u_j = \frac{1}{1 + \lambda_i} \sum_{j=1}^{n} t_{ji} u_j \quad \text{for } i \notin S$$

which is equivalent to Equation (6). Thus, the potential
values on the circuit network are equivalent to the social
influence vector $\mathbf{f}_S$.



4.2 The Relationship Between Linear Model And Traditional
Influence Models

In this section, we will discuss the relationship between linear 
model and the other models to verify the rationality of linear 
model. 




Fig. 1 . The Another Circuit Network. 



4.2.1 Relationship with Independent Cascade Model

The Independent Cascade (IC) model is a well-known and widely
studied influence model. Under this model, if individual $i$ is
activated at time $t$, then it will influence each of her
not-yet-activated friends at time $t + 1$ (and only at time $t + 1$) with
a transmission probability, until no new individual is activated.
Although the IC model has been widely studied, its inefficiency
is a serious drawback. To alleviate this obstacle, Yang
et al. [25] proposed a linear system to approximate the IC model.
They verified both theoretically and experimentally that
the IC model can be approximated as

$$\mathbf{f}^{IC}_{\bar{S}} = (\mathbf{I} - \mathbf{T}'_{\bar{S}\bar{S}})^{-1} \mathbf{t}_{\bar{S}} \tag{27}$$

when the transmission matrix $\mathbf{T}$ satisfies $\mathbf{T}'\mathbf{e} \le \mathbf{e}$, where $\mathbf{f}^{IC}_{\bar{S}}$
is the vector of influences to the individuals not in $S$ under
the IC model. Comparing Equation (27) and Equation (25), we
find that, if we set $\mathbf{\Lambda} = \mathbf{0}$, then $\mathbf{f}^{IC}_{\bar{S}} = \mathbf{f}_{\bar{S}}$. In other words, the approximation
model in [25] is a special case of the linear circuit model, and
the linear circuit model can also approximate the IC model.

4.2.2 Relationship with Aggarwal's Stochastic Model

In 2011, Aggarwal et al. [3] proposed a stochastic (ST) model
to model the influence in a network, which is as follows:

$$f^{ST}_{S \to i} = \begin{cases} 1 & i \in S \\ 1 - \prod_{j=1}^{n} (1 - t_{ji} f^{ST}_{S \to j}) & i \notin S \end{cases} \tag{28}$$

where $t_{ji}$ is the transmission probability from $j$ to $i$ and $f^{ST}_{S \to i}$
is the influence from $S$ to $i$ under the ST model. We can prove
that

Theorem 3: If the transmission matrix $\mathbf{T}$ satisfies $\mathbf{T}'\mathbf{e} \le \mathbf{e}$,
then for $i \notin S$,

$$f^{ST}_{S \to i} = \frac{1}{1 + \lambda_i} \sum_{j=1}^{n} t_{ji} f^{ST}_{S \to j}, \quad \lambda_i \in [0, 1) \tag{29}$$

Theorem 3 tells us that the ST model can also be approximated as
a linear model in which the damping coefficient of each individual
lies in [0, 1). Before giving the proof of
Theorem 3, we need to introduce a lemma first.
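Lemma 2: Denote

$$O_k(P) = \sum_{i_1=1}^{n} \sum_{i_2=i_1+1}^{n} \cdots \sum_{i_k=i_{k-1}+1}^{n} p_{i_1} p_{i_2} \cdots p_{i_k}$$

where $P = [p_1, p_2, \ldots, p_n]'$ and $p_i \in [0, 1]$ for all $i$. Then

$$O_k(P) O_1(P) \ge 2 O_{k+1}(P)$$

Proof: Starting from the left-hand side of the inequality (with the convention $i_0 = 0$),

$$O_k(P) O_1(P) = \Big( \sum_{i_1=1}^{n} \cdots \sum_{i_k=i_{k-1}+1}^{n} p_{i_1} \cdots p_{i_k} \Big) \Big( \sum_{i=1}^{n} p_i \Big)$$
$$= \sum_{i_1=1}^{n} \cdots \sum_{i_k=i_{k-1}+1}^{n} \sum_{i_{k+1}=1}^{n} p_{i_1} \cdots p_{i_k} p_{i_{k+1}}$$
$$= \sum_{i_1=1}^{n} \cdots \sum_{i_k=i_{k-1}+1}^{n} \sum_{i_{k+1}=1}^{i_k} p_{i_1} \cdots p_{i_k} p_{i_{k+1}} + O_{k+1}(P)$$

where

$$\sum_{i_1=1}^{n} \cdots \sum_{i_k=i_{k-1}+1}^{n} \sum_{i_{k+1}=1}^{i_k} p_{i_1} \cdots p_{i_{k+1}} = \sum_{i_1=1}^{n} \cdots \sum_{i_k=i_{k-1}+1}^{n} \sum_{i_{k+1}=1}^{i_{k-1}} p_{i_1} \cdots p_{i_{k+1}} + \sum_{i_1=1}^{n} \cdots \sum_{i_k=i_{k-1}+1}^{n} \sum_{i_{k+1}=i_{k-1}+1}^{i_k} p_{i_1} \cdots p_{i_{k+1}}$$

and the first term on the right-hand side is nonnegative. Because

$$\sum_{i_k=i_{k-1}+1}^{n} \sum_{i_{k+1}=i_{k-1}+1}^{i_k} p_{i_k} p_{i_{k+1}} \ge \sum_{i_k=i_{k-1}+1}^{n} \sum_{i_{k+1}=i_k+1}^{n} p_{i_k} p_{i_{k+1}},$$

(the two sides differ only by the nonnegative diagonal terms with $i_{k+1} = i_k$), we have

$$\sum_{i_1=1}^{n} \cdots \sum_{i_k=i_{k-1}+1}^{n} \sum_{i_{k+1}=i_{k-1}+1}^{i_k} p_{i_1} \cdots p_{i_k} p_{i_{k+1}} \ge \sum_{i_1=1}^{n} \cdots \sum_{i_k=i_{k-1}+1}^{n} \sum_{i_{k+1}=i_k+1}^{n} p_{i_1} \cdots p_{i_k} p_{i_{k+1}} = O_{k+1}(P)$$

Summing up the above analysis, we have

$$O_k(P) O_1(P) \ge 2 O_{k+1}(P)$$

□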



If we denote $p_j = t_{ji} f^{ST}_{S \to j}$, Equation (28) for $i \notin S$ can be
rewritten as

$$f^{ST}_{S \to i} = 1 - \prod_{j=1}^{n} (1 - p_j) = \sum_{k=1}^{n} (-1)^{k-1} O_k(P)$$

With this form and Lemma 2, we can now prove Theorem 3.

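The proof of Theorem 3: Because $\mathbf{T}$ satisfies $\mathbf{T}'\mathbf{e} \le \mathbf{e}$,
we have $\sum_{j=1}^{n} t_{ji} \le 1$, and $O_1(P) = \sum_{j=1}^{n} t_{ji} f^{ST}_{S \to j} \le \sum_{j=1}^{n} t_{ji} \le 1$.
With Lemma 2, we get

$$O_k(P) \ge 2 O_{k+1}(P)$$

Since $f^{ST}_{S \to i} = \sum_{k=1}^{n} (-1)^{k-1} O_k(P) = O_1(P) - O_2(P) + O_3(P) - \ldots +
(-1)^{n-1} O_n(P)$, it is easy to get

$$f^{ST}_{S \to i} \le O_1(P)$$

and

$$f^{ST}_{S \to i} \ge O_1(P) - O_2(P) \ge O_1(P) - \frac{1}{2} O_1(P) = \frac{1}{2} O_1(P)$$

In summary,

$$\frac{1}{2} O_1(P) \le f^{ST}_{S \to i} \le O_1(P)$$

which can be rewritten as

$$f^{ST}_{S \to i} = \frac{1}{1 + \lambda_i} O_1(P) = \frac{1}{1 + \lambda_i} \sum_{j=1}^{n} t_{ji} f^{ST}_{S \to j}, \quad \lambda_i \in [0, 1)$$

This completes the proof.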
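
Lemma 2 and the sandwich inequality used in the proof of Theorem 3 can be checked numerically. The sketch below is only a sanity check on one hypothetical vector, not a proof: the values in `p` are made up, sum to at most 1 as the theorem requires, and stand for the products $p_j = t_{ji} f^{ST}_{S \to j}$.

```python
from itertools import combinations
from math import prod


def elementary_symmetric(p, k):
    """O_k(P): sum of products over all k-subsets of the entries of p."""
    return sum(prod(c) for c in combinations(p, k))


def st_combination(p):
    """1 - prod(1 - p_j), the ST-model combination in Eq. (28)."""
    return 1.0 - prod(1.0 - x for x in p)


# Hypothetical values of p_j; their sum is 0.85 <= 1.
p = [0.30, 0.25, 0.20, 0.10]
o1 = elementary_symmetric(p, 1)
f_st = st_combination(p)

# Lemma 2 for every k that has a (k+1)-subset.
lemma_holds = all(
    elementary_symmetric(p, k) * o1 >= 2.0 * elementary_symmetric(p, k + 1)
    for k in range(1, len(p))
)

# Damping coefficient implied by f = O_1 / (1 + lambda_i).
implied_damping = o1 / f_st - 1.0
print(lemma_holds, f_st, implied_damping)
```

For these values the implied damping coefficient falls inside [0, 1), in line with Equation (29).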



4.3 Rethinking Authority In the Perspective Of Influence

According to the dictionary, authority means the power of
someone to influence others. This interpretation gives
a natural relation between influence and authority, that is,
someone's authority is actually the total influence from her to
the others. In the past years, the computation of authority for
many things, such as web pages, Facebook accounts, and Twitter
accounts, has attracted a great deal of attention due to its
importance in the internet era. However, there is less work
on the nature of authority. In this section, we
rethink the concept of authority from the perspective of influence
and then propose a more accurate definition of it.

It is well accepted that the PageRank algorithm and its variants, such as topic-sensitive PageRank, are the best methods to compute the authority of a node in a graph, which could be the Internet, a web network, Twitter, etc. Based on @, the general PageRank of nodes in a network can be formalized as follows. Denote $x_t = [x^t_1, x^t_2, \dots, x^t_n]'$ as the PageRank vector of all nodes on topic $t$; then

$$x_t = dAx_t + \frac{1-d}{|S_t|}\,e_t,$$

where $d$ is a coefficient in $(0, 1)$; $A = [a_{ij}]$ is an $n \times n$ matrix with $a_{ij} = \frac{w_{ji}}{\sum_{k=1}^{n} w_{jk}}$ (footnote 4) if there is an edge $(j, i) \in E$ and 0 otherwise; $S_t$ is the set of nodes that belong to topic $t$; and $e_t = [e_1, e_2, \dots, e_n]'$, where $e_i = 1$ if node $i \in S_t$ and 0 otherwise. Notably, when $S_t$ contains all nodes in the network (i.e., $S_t = \mathcal{V}$), $x_t$ is the general PageRank vector.
The above equation can be solved as

$$x_t = (I - dA)^{-1}\,\frac{1-d}{|S_t|}\,e_t \;\overset{d=\frac{1}{1+\lambda}}{=}\; \frac{\lambda}{|S_t|}\,(I + \lambda I - A)^{-1} e_t, \qquad (30)$$



where, for $d \in (0, 1)$, $\lambda \in (0, +\infty)$. Moreover, $\sum_{i=1}^{n} a_{ij} = 1$, that is, $A'e = e$, which means $A$ is a transmission matrix satisfying Assumption 1. Thus, if we view $A$ as $T$ and $\lambda I$ as $\Lambda$, the matrix $(I + \lambda I - A)^{-1}$ can be viewed as the transpose of the basis matrix $P$, that is,

$$x_t = \frac{\lambda}{|S_t|}\,P' e_t,$$



4. $w_{ji}$ is the weight of edge $(j, i)$; usually it equals 1.
which could be rewritten as, for $i = 1, 2, \dots, n$,

$$x^t_i = \frac{\lambda}{|S_t|}\,p_{i\to S_t} \;\overset{\text{Corollary 1}}{=}\; \frac{\lambda}{|S_t|(1+\lambda)}\,b_{i\to S_t}. \qquad (31)$$
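The rewriting of the PageRank recursion into the form of Equation (30) can be sanity-checked on a small example. The sketch below is illustrative only: the 4-node graph, the topic set, and the helper `matvec` are assumptions made for the example, and the linear system is solved by plain fixed-point iteration rather than by matrix inversion.

```python
def matvec(A, x):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# Hypothetical 4-node graph; edges[j] lists the nodes that j links to (unit weights).
edges = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}
n = 4
# a_ij = w_ji / sum_k w_jk, so every column of A sums to 1 (A'e = e).
A = [[0.0] * n for _ in range(n)]
for j, targets in edges.items():
    for i in targets:
        A[i][j] = 1.0 / len(targets)

d = 0.85
lam = (1 - d) / d                 # equivalently d = 1 / (1 + lam)
topic = {0, 2}                    # S_t, chosen arbitrarily for the example
e_t = [1.0 if i in topic else 0.0 for i in range(n)]

# Solve x = d A x + (1 - d)/|S_t| e_t by fixed-point iteration.
x = [0.0] * n
for _ in range(500):
    Ax = matvec(A, x)
    x = [d * Ax[i] + (1 - d) / len(topic) * e_t[i] for i in range(n)]

# Residual of the rewritten system (I + lam I - A) x = lam/|S_t| e_t.
Ax = matvec(A, x)
residual = max(abs((1 + lam) * x[i] - Ax[i] - lam / len(topic) * e_t[i])
               for i in range(n))
print(x, residual)
```

A residual near machine precision indicates that the same vector satisfies both forms, which is what the substitution $d = 1/(1+\lambda)$ claims.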

From this equation, we can see that node $i$'s PageRank value on topic $t$ is proportional to the upper bound of $f_{i\to S_t}$. Because the bound is consistently close to the real influence, PageRank works well on the task of authority estimation. In essence, however, the following assertion should be true.

Assertion 1: An individual's authority on a group is essentially her total influence on the members of the group.

Based on it, we propose a definition which may be closer to the nature of authority.

Definition 1: The authority of $i$ on group $\mathcal{T}$ is equal to the sum of influences from $i$ to each member of $\mathcal{T}$, that is,

$$f_{i\to\mathcal{T}} = \sum_{j\in\mathcal{T}} f_{i\to j}. \qquad (32)$$

With Equation [17] 

This equation also tells us why we call $p_{i\to\mathcal{T}}$ the potential of $i$ influencing $\mathcal{T}$.

5 An Application to Viral Marketing Problem 

Based on the discussion in Section 3, the viral marketing problem, also called the top-$K$ seeds selection problem or the social influence maximization problem, which aims at finding a small set of "influential" members of a network (called "seeds"), can be formalized as the following optimization problem:

$$S = \arg\max_{S\subset\mathcal{V}} f_{S\to\mathcal{V}} \quad \text{subject to } |S| = K.$$

In this problem, $f_{S\to\mathcal{V}}$ is the influence from set $S$ to set $\mathcal{V}$; in other words, it is the expected number of individuals who will be influenced by members of $S$ in the social network. This number is conventionally called the influence spread of $S$ and denoted as $\sigma(S)$. $\sigma(\cdot)$ is a submodular function under the IC model, that is,

Theorem 4: For all seed sets $S \subset T \subset \mathcal{V}$ and any node $s$, it holds that

$$\sigma_{IC}(S\cup\{s\}) - \sigma_{IC}(S) \ge \sigma_{IC}(T\cup\{s\}) - \sigma_{IC}(T), \qquad (33)$$

where $\sigma_{IC}(S)$ is the influence spread of $S$ under the IC model.

If we denote $\Delta(S, s) = \sigma_{IC}(S\cup\{s\}) - \sigma_{IC}(S)$, then as a corollary of Theorem 4 we have

Corollary 2: Suppose $S_0 \subset S_1 \subset S_2 \cdots \subset S_K$ and $|S_i| = i$; then

$$\Delta(S_0, s) \ge \Delta(S_1, s) \ge \Delta(S_2, s) \cdots \ge \Delta(S_K, s), \qquad (34)$$

where $\Delta(S, s)$ denotes the marginal influence spread increment when adding $s$ into seed set $S$.

Proposed Algorithm. As illustrated in [16], the optimization problem of top-$K$ seeds selection is NP-hard, and by exploiting the submodular property of $\sigma(S)$, a greedy strategy is guaranteed to obtain a solution that is within $(1 - 1/e)$ of the optimal result. The greedy framework always chooses the individual who produces the maximal marginal increment on influence spread when added into $S$. The greedy algorithm starts with an empty set $S_0 = \emptyset$ and iteratively, in each step $k$, adds into $S_{k-1}$ the individual $s_k$ who maximizes the increment on influence spread, that is,

$$s_k = \arg\max_{s\in\mathcal{V}\setminus S_{k-1}} \Delta(S_{k-1}, s),$$

until the cardinality of the seed set is $K$. Algorithm 1 describes the greedy framework.

Algorithm 1: Greedy Framework

1. $S = \emptyset$;

2. $s = \arg\max_{s\in\mathcal{V}\setminus S} \Delta(S, s)$;

3. $S = S \cup \{s\}$;

4. If $|S| < K$, then go back to step 2; else terminate.
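The four steps of the greedy framework can be sketched in a few lines. This is a generic illustration, not the paper's implementation: the function `greedy_seeds`, the toy `reach` dictionary, and the coverage function standing in for the influence spread are all hypothetical.

```python
def greedy_seeds(nodes, spread, K):
    """Generic greedy framework: repeatedly add the node with the largest marginal gain."""
    S = set()
    while len(S) < K:
        base = spread(S)
        # Step 2: pick the remaining node that maximizes spread(S + v) - spread(S).
        best = max((v for v in nodes if v not in S),
                   key=lambda v: spread(S | {v}) - base)
        S.add(best)                      # step 3
    return S

# Toy submodular spread: number of nodes covered by the seeds' neighborhoods.
reach = {"a": {"a", "b", "c"}, "b": {"b", "c"}, "c": {"c", "d"}, "d": {"d"}}

def coverage(S):
    covered = set()
    for v in S:
        covered |= reach[v]
    return len(covered)

seeds = greedy_seeds(list(reach), coverage, 2)
print(seeds, coverage(seeds))
```

Every call to `spread` here is cheap; under the IC model each such call would be a batch of Monte-Carlo simulations, which is why step 2 dominates the cost.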



In the framework, step 2 is the most time-consuming step. Under the IC model, the only available way to get $\Delta(S, s)$ is to run Monte-Carlo simulations of the model a sufficiently large number of times (e.g., 20,000), which is very inefficient.

Because the linear circuit (LC) model can approximate the IC model (see Section 4.2), in this paper we use $f^{LC}_{S\to\mathcal{V}}$ (i.e., the $f_{S\to\mathcal{V}}$ discussed in Section 3) to substitute for $\sigma(S)$, that is,

$$\Delta(S, s) \approx \Delta^{I}(S, s) = f^{LC}_{S\cup\{s\}\to\mathcal{V}} - f^{LC}_{S\to\mathcal{V}}. \qquad (35)$$

Based on the discussion in Section 4.1, we know that $f^{LC}_{S\to\mathcal{V}}$ can be computed in $O(|E|)$ time; thus $\Delta(S, s)$ can be computed in $O(|E|)$ time under the linear circuit model.

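Moreover, we can continue this reduction. Based on the discussion in Section 3.3, $b_{S\to\mathcal{V}}$ is an estimation of $f_{S\to\mathcal{V}}$, so we can further substitute $b_{S\to\mathcal{V}}$ for $f^{LC}_{S\to\mathcal{V}}$, that is,

$$\Delta(S, s) \approx \Delta^{II}(S, s) = b_{S\cup\{s\}\to\mathcal{V}} - b_{S\to\mathcal{V}}. \qquad (36)$$

With Equation (23), this equation can be rewritten as

$$\Delta(S, s) \approx \Big(1 + \lambda_s - \sum_{j\in S} t_{js}\Big)\,p_{s\to\mathcal{V}} - \sum_{j\in S} t_{sj}\,p_{j\to\mathcal{V}}. \qquad (37)$$

Since $p_{\mathcal{V}} = [p_{1\to\mathcal{V}}, p_{2\to\mathcal{V}}, \dots, p_{n\to\mathcal{V}}]'$ can be computed in advance (see Section 3.3), for any $S$ and any $s$ the computation of $\Delta(S, s)$ in Equation (37) takes only $O(|S|)$ time.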
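The step from the difference of two bounds to the $O(|S|)$ expression can be verified numerically. The sketch below assumes the form of the bound restated in Section 6.3, $b_{S\to\mathcal{V}} = \sum_{j\in S}(1 + \lambda_j - \sum_{k\in S} t_{kj})\,p_{j\to\mathcal{V}}$, and additionally assumes a zero diagonal in $T$ (no self-influence); the function names and the random values are illustrative.

```python
import random

def upper_bound(S, lam, T, p):
    """b_{S->V} = sum_{j in S} (1 + lam_j - sum_{k in S} t_kj) * p_j."""
    return sum((1 + lam[j] - sum(T[k][j] for k in S)) * p[j] for j in S)

def delta_ii(S, s, lam, T, p):
    """Marginal bound of adding s to S, using only O(|S|) terms."""
    return ((1 + lam[s] - sum(T[j][s] for j in S)) * p[s]
            - sum(T[s][j] * p[j] for j in S))

random.seed(1)
n = 6
lam = [0.2] * n
p = [random.random() for _ in range(n)]      # stands in for the precomputed p_{j->V}
# Random transmission matrix with zero diagonal, an assumption of this sketch.
T = [[0.0 if i == j else random.random() / n for j in range(n)] for i in range(n)]

S, s = [0, 3, 4], 2
lhs = upper_bound(S + [s], lam, T, p) - upper_bound(S, lam, T, p)
rhs = delta_ii(S, s, lam, T, p)
print(lhs, rhs)
```

The two printed values agree up to rounding, because adding $s$ changes each existing term only by $-t_{sj}\,p_{j\to\mathcal{V}}$ and contributes one new term for $s$ itself.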

This reduction brings a further benefit. Based on Corollary 1, we have $\Delta^{I}(S_0, s) = f^{LC}_{s\to\mathcal{V}} \le (1 + \lambda_s)\,p_{s\to\mathcal{V}}$ and $\Delta^{II}(S_0, s) = b_{s\to\mathcal{V}} = (1 + \lambda_s)\,p_{s\to\mathcal{V}}$; thus, if we substitute $\Delta^{I}(S, s)$ or $\Delta^{II}(S, s)$ for $\Delta(S, s)$, Corollary 2 can be rewritten as

$$(1 + \lambda_s)\,p_{s\to\mathcal{V}} \ge \Delta(S_0, s) \ge \Delta(S_1, s) \cdots \ge \Delta(S_K, s),$$

which means that the marginal influence increment of individual $s$ cannot be larger than $(1 + \lambda_s)\,p_{s\to\mathcal{V}}$ or her marginal increment in previous iterations. Thus, in each iteration of Algorithm 1, when we enter step 2, we have at least one upper bound for estimating $\Delta(S, s)$: either $(1 + \lambda_s)\,p_{s\to\mathcal{V}}$ or $s$'s increment in the last iteration. An upper bound, as an estimation of the real value, helps us avoid a large amount of useless computation. For example, if the upper bound of an individual's influence increment is not large enough compared with the best increment found so far, she cannot make $\Delta(S, s)$ maximal, so we can simply skip her. This property of the upper bound lets us skip many individuals with small influence and sharply reduce the total computation time. The complete procedure for the viral marketing problem is illustrated in Algorithm 2.



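Algorithm 2: LinearCircuitMethod($G$, $K$, $\Lambda$, $T$)
input : $G(\mathcal{V}, E)$, $K$, $\Lambda$, $T$
output: $S$

$S = \emptyset$;
Compute authority vector $p_{\mathcal{V}} = [p_{1\to\mathcal{V}}, p_{2\to\mathcal{V}}, \dots, p_{n\to\mathcal{V}}]'$ (see Section 3.3);
for each vertex $s$ in $G$ do
    $\Delta_s = (1 + \lambda_s)\,p_{s\to\mathcal{V}}$;
while $|S| < K$ do
    re-arrange the order of nodes to make $\Delta_s \ge \Delta_{s+1}$;
    $\Delta_{max} = 0$;
    for $s = 1$ to $n - |S|$ do
        if $\Delta_s > \Delta_{max}$ then
            $\Delta_s$ = GetDeltaI($S$, $s$, $f_{S\to\mathcal{V}}$, $\Lambda$, $T$);
            // or $\Delta_s$ = GetDeltaII($S$, $s$, $\Lambda$, $T$, $p_{\mathcal{V}}$);
            if $\Delta_s > \Delta_{max}$ then
                $\Delta_{max} = \Delta_s$;
                $s_{max} = s$;
        else
            break;
    $S = S \cup \{s_{max}\}$;
    $f_{S\to\mathcal{V}} = f_{S\to\mathcal{V}} + \Delta_{max}$;
    $\Delta_{s_{max}} = 0$;
return $S$;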



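Function GetDeltaI($S$, $s$, $f_{S\to\mathcal{V}}$, $\Lambda$, $T$)
input : $S$, $s$, $f_{S\to\mathcal{V}}$, $\Lambda$, $T$
output: $\Delta_s$

If $P_{\cdot s}$ has never been computed, compute it first (see Section 3.2);
$S' = S \cup \{s\}$;
Compute $v_{S'} = P_{S'S'}^{-1} e$ (Theorem Q};
$f_{S'} = 0$, $f_{S'\to\mathcal{V}} = 0$;
for each $j \in S'$ do
    $f_{S'}$ += $v_j P_{j\cdot}$; // Equation (15)
for each $j \in \mathcal{V}$ do
    $f_{S'\to\mathcal{V}}$ += $f_{S'\to j}$; // Axiom[rj
return $f_{S'\to\mathcal{V}} - f_{S\to\mathcal{V}}$;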



Function GetDeltaII($S$, $s$, $\Lambda$, $T$, $p_{\mathcal{V}}$)
input : $S$, $s$, $\Lambda$, $T$, $p_{\mathcal{V}}$
output: $\Delta_s$

$\Delta_s = (1 + \lambda_s)\,p_{s\to\mathcal{V}}$;
for each $j \in S$ do
    $\Delta_s = \Delta_s - t_{js}\,p_{s\to\mathcal{V}} - t_{sj}\,p_{j\to\mathcal{V}}$;
return $\Delta_s$;

In Algorithm 2, we use $\Delta_s$ to store the upper bound of $\Delta(S, s)$ and use $\Delta_{max}$ and $s_{max}$ to store the maximal $\Delta(S, s)$ and its corresponding $s$. The algorithm starts with an empty set $S = \emptyset$, at which moment $\Delta_s = (1 + \lambda_s)\,p_{s\to\mathcal{V}}$, and in each iteration it adds the $s$ with the maximal $\Delta(S, s)$ into $S$ until the cardinality of $S$ is equal to $K$. Specifically, in each iteration, we first re-arrange the indices of the individuals so that $\Delta_s \ge \Delta_{s+1}$, which lets us examine the individuals with large $\Delta$ first and avoid useless computation on the rest; then, for each individual $s$, we compare her upper bound $\Delta_s$ with $\Delta_{max}$. 1) If it is larger, then $s$ may be a better candidate, and we need to compute her real increment; if her real increment is still larger than $\Delta_{max}$, then she is truly better, and we store her index in $s_{max}$ and her real increment in $\Delta_{max}$. 2) If it is smaller, then $s$ and all of her successors cannot be better than the current $s_{max}$, because $\Delta_{max} \ge \Delta_s \ge \Delta_{s+1}$; then we can break out of the iteration. When an iteration ends, the index of the best additional individual has been stored in $s_{max}$; we add her into $S$ and, at the same time, add her real increment into $f_{S\to\mathcal{V}}$. At last, we set $\Delta_{s_{max}}$ to 0, so that individual $s_{max}$ cannot be selected again. According to the way $\Delta(S, s)$ is computed, we call the algorithm with Function GetDeltaI the Circuit_Complete (CC) method and the one with Function GetDeltaII the Circuit_Fast (CF) method.
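The pruning logic of the Circuit_Fast variant can be sketched as follows. This is an illustrative reading of the pseudocode, not the authors' code: the vector `p` stands in for the precomputed $p_{j\to\mathcal{V}}$ (here filled with random values), selected nodes are excluded from the candidate list instead of having their bound set to 0, and `plain_greedy` is a hypothetical reference that evaluates every candidate.

```python
import random

def get_delta_ii(S, s, lam, T, p):
    """Marginal bound in the style of GetDeltaII; p holds the precomputed p_{j->V}."""
    delta = (1 + lam[s]) * p[s]
    for j in S:
        delta -= T[j][s] * p[s] + T[s][j] * p[j]
    return delta

def circuit_fast(K, lam, T, p):
    """Greedy selection that skips candidates whose stored upper bound cannot win."""
    n = len(p)
    bound = [(1 + lam[s]) * p[s] for s in range(n)]   # initial upper bounds
    S, evaluations = [], 0
    while len(S) < K:
        # Visit the remaining candidates in descending order of their bounds.
        order = sorted((s for s in range(n) if s not in S), key=lambda s: -bound[s])
        best, best_s = 0.0, None
        for s in order:
            if bound[s] <= best:
                break                                  # later candidates are no better
            bound[s] = get_delta_ii(S, s, lam, T, p)   # refresh with the current value
            evaluations += 1
            if bound[s] > best:
                best, best_s = bound[s], s
        if best_s is None:
            break                                      # no candidate has a positive gain
        S.append(best_s)
    return S, evaluations

def plain_greedy(K, lam, T, p):
    """Reference: evaluate every remaining candidate in every round."""
    S = []
    while len(S) < K:
        rest = [s for s in range(len(p)) if s not in S]
        S.append(max(rest, key=lambda s: get_delta_ii(S, s, lam, T, p)))
    return S

random.seed(7)
n, K = 30, 5
lam = [0.2] * n
p = [1 + random.random() for _ in range(n)]
T = [[0.0 if i == j else random.random() / n for j in range(n)] for i in range(n)]
seeds, evaluations = circuit_fast(K, lam, T, p)
print(seeds, evaluations, "of", sum(n - k for k in range(K)))
```

The pruned search returns the same seeds as the exhaustive one because a stored bound is never smaller than the current increment: each round only subtracts non-negative terms from it.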

6 Experiments

In this section, we conduct the following experiments: a) comparing the linear circuit model with the IC model and the ST model to verify the relationship among them; b) demonstrating the effectiveness of the upper bound $b_{S\to\mathcal{T}}$ for estimating $f_{S\to\mathcal{T}}$; c) evaluating the performance of the Circuit_Complete and Circuit_Fast algorithms and comparing them with the state-of-the-art algorithms on real-world social networks.

6.1 Data Sets

The first data set, denoted as Polblogs, is a directed network of hyperlinks between weblogs on US politics, recorded in 2005 by Adamic and Glance [1]. There are 1,499 nodes and 19,090 edges in this network.

The second data set, denoted as Wiki-Vote, is a Wikipedia voting network in which nodes represent Wikipedia users and a directed edge from node $i$ to node $j$ represents that user $i$ voted on user $j$. The network contains all the Wikipedia voting data from the inception of Wikipedia until January 2008 (footnote 5). This directed network contains 7,115 nodes and 103,689 edges.

The third one, denoted as ca-HepPh, is a collaboration network from the e-print arXiv which covers scientific collaborations between authors whose papers have been submitted to the High Energy Physics - Phenomenology category (footnote 6). This undirected network contains 12,008 nodes and 654,188 edges.

5. http://snap.stanford.edu/data/wiki-Vote.html

(a) Polblogs. (b) Wiki-Vote. (c) ca-HepPh.

Fig. 2. The cosine similarity between the LC model and the other two models on Polblogs, Wiki-Vote, and ca-HepPh, respectively. In the three subfigures, the black vertical dashed line marks the optimal value of similarity, and the purple horizontal line is the level of similarity between the IC model and the ST model. Sim100(A, B) is the similarity between model A and model B on 100 randomly selected sets.

The fourth one, denoted as DBLP, is an even larger collaboration network, the DBLP Computer Science Bibliography Database, which is the same as in [8]. This undirected network contains 655,000 nodes and 1,967,265 edges.

The fifth one, denoted as web-NotreDame, is a webpage link network where nodes represent pages from the University of Notre Dame (domain nd.edu) and directed edges represent hyperlinks between them. The data was collected in 1999 by Albert, Jeong and Barabasi (footnote 7). This directed network contains 325,729 nodes and 1,497,134 edges.

The sixth one, denoted as Livejournal, is a friendship network crawled from Livejournal (footnote 8) in July 2010 [26]. This is a large-scale network, containing 2,238,731 nodes and 14,608,137 edges.

We chose these networks since they cover a variety of networks, with sizes ranging from 19K edges to 14M edges, and include four directed networks and two undirected networks.



6.2 Model Similarity 



In Section 4.2, we proved that the linear circuit (LC) model is closely related to the independent cascade (IC) model and the stochastic (ST) model. In this section, we verify their relationship by experimental results. Suppose $f^{LC}_S$, $f^{IC}_S$, and $f^{ST}_S$ are the influence vectors of seed set $S$ under the LC model, the IC model, and the ST model, respectively. If the LC model is closely related to the other two models, $f^{LC}_S$ must be similar to $f^{IC}_S$ and $f^{ST}_S$ for any set $S$, and vice versa. In this paper, we use cosine similarity as the metric to measure the similarity between $f^{LC}_S$ and $f^{IC}_S$, $f^{ST}_S$. That is,



$$\mathrm{Sim}(f^A_S, f^B_S) = \mathrm{Cos}(f^A_S, f^B_S), \qquad (38)$$

6. http://snap.stanford.edu/data/ca-HepPh.html

7. http://snap.stanford.edu/data/web-NotreDame.html

8. http://www.livejournal.com



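where $A$, $B$ are indicators of the models and $\mathrm{Cos}$ is the cosine function. Along this line, we propose a formula to define the similarity between models:

$$\mathrm{Sim}(A, B) = \frac{\sum_{S\subset\mathcal{V}} \mathrm{Sim}(f^A_S, f^B_S)}{|\{S \mid S\subset\mathcal{V}\}|}. \qquad (39)$$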


This quantity is very expensive to compute, for there are $2^{|\mathcal{V}|}$ choices for $S$. In practice, however, we can randomly select a certain number of sets as a representation of all sets to get an approximation of $\mathrm{Sim}(A, B)$.
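The sampling approximation can be sketched as below. The sampling scheme (a uniformly random size, then a uniformly random subset of that size) and the two stand-in influence functions `model_a` and `model_b` are assumptions made for illustration; the text does not specify how the sets were drawn.

```python
import math
import random

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all-zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def sampled_similarity(influence_a, influence_b, nodes, num_sets, rng):
    """Average cosine similarity of two models' influence vectors over random seed sets."""
    total = 0.0
    for _ in range(num_sets):
        size = rng.randint(1, len(nodes))
        S = rng.sample(nodes, size)
        total += cosine(influence_a(S), influence_b(S))
    return total / num_sets

# Two hypothetical stand-in "models" on 5 nodes, for illustration only.
nodes = list(range(5))

def model_a(S):
    return [1.0 if v in S else 0.5 for v in nodes]

def model_b(S):
    return [1.0 if v in S else 0.4 for v in nodes]

rng = random.Random(0)
sim = sampled_similarity(model_a, model_b, nodes, 100, rng)
print(sim)
```

Replacing the stand-ins with functions that return the LC, IC, or ST influence vector of a seed set would give the sampled counterpart of Equation (39).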

On three data sets, Polblogs, Wiki-Vote, and ca-HepPh, we compute the similarities between models under the following settings:

1) randomly select 50,000 sets as a representation of all sets;
2) set $\Lambda = \lambda I$ (footnote 9), where $\lambda$ ranges in $[0, 1)$ (see Section 4.2), starting from 0.01 with a step of 0.01;
3) set $T = D^{-1}W$ (footnote 10), where $W$ is the trust weight matrix of $G$ and $D = \mathrm{diag}(We)$.

The experimental results are shown in Figure 2. We can make the following observations:

• On each data set, the similarity between the LC model and the IC model can reach a high level (even larger than 0.99); the similarity between the ST model and the other two models is not very stable across data sets, e.g., on ca-HepPh the similarity between the ST and IC models is only 0.88;

• The similarity curves between the LC model and the IC model all first increase and then decrease, and reach their peaks at a certain $\lambda$ in $[0.05, 0.15]$;

• When $\lambda$ ranges in $[0.10, 0.30]$, the similarity between the LC model and the IC model stays at a high level (always larger than 0.97) on every data set;

• The curves of Sim(A, B) and Sim100(A, B) are very close: the similarity computed on 50,000 randomly selected sets differs little from the similarity computed on 100 sets.

9. It means that the damping coefficients of all individuals are identical, which corresponds to a global model rather than a personalized model.

10. In this setting, $Te = D^{-1}We = e$, which satisfies Assumption 1, and $t_{ij} = \frac{w_{ij}}{\sum_{k} w_{ik}}$.
(a) Polblogs. (b) Wiki-Vote. (c) ca-HepPh.

Fig. 3. The plots of $(f_{S\to\mathcal{T}}, b_{S\to\mathcal{T}})$ and their fitting curves on Polblogs, Wiki-Vote, and ca-HepPh, respectively.



These observations tell us that the LC model can be a proper approximation of the IC model; if we want to use the LC model to approximate the IC model, we had better assign $\lambda$ a value in $[0.10, 0.30]$; and if we want to find out whether a $\lambda$ value is good for approximating the IC model or the ST model, we only need to test it by computing the model similarity on a small number of sample sets. The above observations also explain why it is proper to use $f^{LC}_{S\to\mathcal{V}}$ to replace $f^{IC}_{S\to\mathcal{V}}$ in Section 5.

6.3 The Comparison Between $f_{S\to\mathcal{T}}$ and $b_{S\to\mathcal{T}}$

In Section 3.3, we proved that $b_{S\to\mathcal{T}} = \sum_{j\in S}\big(1 + \lambda_j - \sum_{k\in S} t_{kj}\big)\,p_{j\to\mathcal{T}}$ is an upper bound of $f_{S\to\mathcal{T}}$. In this section, we explore the relationship between them experimentally. On the three data sets, Polblogs, Wiki-Vote, and ca-HepPh, we compute 1,000 pairs of $(f_{S\to\mathcal{T}}, b_{S\to\mathcal{T}})$ respectively and plot them in three coordinate systems. The three plots are shown in Figure 3, where the red lines are the optimal linear fits to the points. From Figure 3, we can observe that

• The upper bound $b_{S\to\mathcal{T}}$ is almost linearly correlated with the influence $f_{S\to\mathcal{T}}$; when fitting a line to the points, the coefficients of variation are only 0.0303, 0.0116, and 0.0302, respectively;

• The upper bound $b_{S\to\mathcal{T}}$ is consistently close to the influence $f_{S\to\mathcal{T}}$; the gradients of the three fitted lines are 1.066, 1.016, and 1.196, which means that on average $b_{S\to\mathcal{T}}$ exceeds $f_{S\to\mathcal{T}}$ by only 6.6%, 1.6%, and 19.6%, respectively.

Because $b_{S\to\mathcal{T}}$ is consistently close to $f_{S\to\mathcal{T}}$, it is feasible to substitute $b_{S\to\mathcal{T}}$ for $f_{S\to\mathcal{T}}$. Since the computation cost of $b_{S\to\mathcal{T}}$ is much lower than that of $f_{S\to\mathcal{T}}$, this substitution is likely to sharply reduce the computation cost in real applications. The experimental settings of this part follow those of the previous section:

• $\Lambda = \lambda I$ and $\lambda = 0.2$;
• $\mathcal{T}$ is set to be $\mathcal{V}$;
• $T = D^{-1}W$.
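The setting $T = D^{-1}W$ with $D = \mathrm{diag}(We)$ amounts to dividing each row of $W$ by its sum. A minimal sketch follows; the weight matrix is hypothetical, and leaving all-zero rows untouched is a choice of this sketch, since the text does not say how such rows are treated.

```python
def row_normalize(W):
    """Return T = D^{-1} W with D = diag(We); rows with zero total weight stay all-zero."""
    T = []
    for row in W:
        total = sum(row)
        T.append([w / total if total else 0.0 for w in row])
    return T

W = [[0, 2, 2],
     [1, 0, 3],
     [0, 0, 0]]   # hypothetical trust weights; the last node has no outgoing weight
T = row_normalize(W)
row_sums = [sum(row) for row in T]
print(T, row_sums)
```

Every non-empty row of the result sums to 1, so the row sums never exceed 1.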

6.4 Viral Marketing Campaign Design (or Top-K Seeds 
Selection) 

In this section, we apply Circuit_Complete (CC) and Circuit_Fast (CF) to the viral marketing problem and compare them with the state-of-the-art algorithms to verify their effectiveness and efficiency.

Benchmark Algorithms. The benchmark algorithms for the viral marketing problem are as follows. First, Circuit_Independent (CI) is the algorithm proposed in [24]. CELF is the original greedy algorithm with the CELF optimization of [19], where the number of Monte-Carlo simulations is set to 20,000. PMIA is the algorithm proposed in [8]. We used the source code provided by the authors and set the parameters to the ones that produce the best results (footnote 11). In the PageRank (PR) algorithm Ell , we selected the top-$K$ nodes with the highest PageRank values. DegreeDiscountIC (DIC) [9] is the degree discount heuristic with a propagation probability of $p = 0.01$, which is the same as used in [19]. Finally, the Degree (Deg) method selects the top-$K$ nodes with the highest degree. Among these algorithms, Degree, DegreeDiscountIC, and PageRank are widely used as baselines. To the best of our knowledge, CELF and PMIA are two of the best existing algorithms for solving the viral marketing problem (concerning the tradeoff between effectiveness and efficiency).

Measurement. The effectiveness of the algorithms for the viral marketing problem is measured by the estimated number of individuals that will be influenced by the seed set chosen by each algorithm, i.e., the influence spread $\sigma(S)$. To estimate the influence spread, for each seed set we run the Monte-Carlo simulation under the independent cascade model (footnote 12) 20,000 times to find how many individuals can be influenced, and then use these influence spreads to compare the effectiveness of the algorithms.
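The measurement procedure can be sketched as a small Monte-Carlo simulator. This is an illustration under stated assumptions: the toy graph is hypothetical, and the edge probabilities follow the commonly used weighted-cascade convention (probability $1/\text{in-degree}$ of the target node), which is assumed here rather than taken from the text.

```python
import random

def simulate_ic(out_edges, prob, seeds, rng):
    """One independent-cascade run; returns the number of influenced nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v in out_edges.get(u, ()):
                # Each newly influenced u gets exactly one chance to influence v.
                if v not in active and rng.random() < prob[(u, v)]:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return len(active)

def estimate_spread(out_edges, prob, seeds, runs, rng):
    """Average number of influenced nodes over independent simulation runs."""
    return sum(simulate_ic(out_edges, prob, seeds, rng) for _ in range(runs)) / runs

def weighted_cascade_probs(out_edges):
    """Assumed WC setting: edge (u, v) gets probability 1 / in-degree(v)."""
    indeg = {}
    for u, vs in out_edges.items():
        for v in vs:
            indeg[v] = indeg.get(v, 0) + 1
    return {(u, v): 1.0 / indeg[v] for u, vs in out_edges.items() for v in vs}

graph = {0: [1, 2], 1: [3], 2: [3]}          # hypothetical toy network
prob = weighted_cascade_probs(graph)
rng = random.Random(0)
spread = estimate_spread(graph, prob, [0], 20000, rng)
print(spread)
```

On this toy graph the exact expected spread of seed 0 is $3.75$ (nodes 0, 1, 2 are always influenced and node 3 with probability $1 - 0.5^2$), which the estimate should approach as the number of runs grows.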

Experimental Platform. The experiments were performed on a server with a 2.0GHz Quad-Core Intel Xeon E5410 and 8 GB of memory.



11. Based on the source code from its authors, the parameter is selected from (1/10, 1/20, 1/40, 1/80, 1/160, 1/320, 1/1280).

12. In detail, under the IC model, the nodes in the seed set propagate their influence through the following operations. Let us view the nodes in the seed set $S$ as the nodes influenced at time $t = 0$; if node $i$ is influenced at time $t$, then it will influence its not-yet-influenced neighbor node $j$ at time $t + 1$ (and only at time $t + 1$) with transmission probability $t_{ij}$. In this paper, as long as the transmission probabilities on edges satisfy the constraint of Assumption 1, our method can handle the corresponding influence maximization problem. Due to the limited space, in this paper we set the transmission probability $t_{ij}$ equal to # which is widely adopted in previous studies, and the corresponding model is called the Weighted Cascade (WC) model.
6.4.1 A Performance Comparison

In the following, we present a performance comparison of both effectiveness and efficiency between our algorithms and the benchmarks. For the purpose of comparison, we record the best performance of each algorithm by tuning its parameters. We run tests on the six networks under the WC model to obtain the results of influence spread. The seed set size $K$ ranges from 1 to 50. Figure 4 shows the final results of influence spread, where we draw a marker at every 5 points. In this figure, if two curves are too close to each other, we group them together and indicate this in the legend. Figure 5 shows the computational performance comparison for selecting 50 seeds with the best parameters. Because the running times of DIC and Deg are almost 0, we omit them from this figure. Due to running time overflow, CELF fails on the networks web-NotreDame, DBLP, and Livejournal. Due to memory overflow, CC fails on the network Livejournal.

From Figure 4, we can get a beat record for each pair of algorithms. For example, if algorithm A beats algorithm B on $x$ data sets, then we get a record (A, B, $x$). We put these records into a table where the value in the (A, B)-entry is $x$. Table 6.4.1 is the beat table. Moreover, the last column of Table 6.4.1 shows the total number of beat times, and the last row shows the total number of defeated times. For an algorithm, if we use the difference between its total number of beat times and defeated times as its strength, we can get its position among all the algorithms. The differences of the seven algorithms are 25, 29, 10, -5, -5, -20, -34 respectively. Based on these numbers, we can get the order of these algorithms, i.e., CI > CC > CF > PMIA = PR > DIC > Deg, where ">" means "is better than". In fact, however, the performance of CC is even a little better than that of CI on the five networks excluding Livejournal. Because of its failure on Livejournal, its overall performance is worse than that of CI. Besides, we did not list CELF in the beat table because it only succeeded on three networks, although its performance is a little better than that of CC.
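The strength score used for this ranking is the row sum minus the column sum of the beat table. The sketch below shows the computation on a small made-up table; the algorithm names and counts are hypothetical and are not the paper's results.

```python
def strengths(beat):
    """Strength of each algorithm = total wins (row sum) - total defeats (column sum)."""
    names = list(beat)
    wins = {a: sum(beat[a].get(b, 0) for b in names) for a in names}
    losses = {b: sum(beat[a].get(b, 0) for a in names) for b in names}
    return {a: wins[a] - losses[a] for a in names}

# Hypothetical beat counts over 6 data sets; these are NOT the paper's numbers.
beat = {
    "X": {"Y": 4, "Z": 6},
    "Y": {"X": 2, "Z": 5},
    "Z": {"X": 0, "Y": 1},
}
score = strengths(beat)
ranking = sorted(score, key=score.get, reverse=True)
print(score, ranking)
```

Because every win of one algorithm is a defeat of another, the strengths always sum to zero, which gives a quick consistency check on any such table.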

In terms of running time, Figure 5 illustrates the computational costs of the different algorithms on the different data sets. Because the running times of DIC and Deg are almost equal to 0, we removed them from the figure. We can see that, in this respect, the order of the algorithms is PR > CF > CC > CI > PMIA > CELF, where ">" means "is faster than". Notably, the running time of the CF algorithm is almost equal to that of PR, which means that CF is a linear-time algorithm for viral marketing. Based on the discussion in Section 4.3, we know that the authority of an individual is essentially her total influence in the network. Thus, PR can find the top-$K$ most influential individuals. However, these individuals may overlap in their fields of influence, for there is no mechanism to guarantee that they all have exclusive territories. CF can guarantee this to some extent, and therefore it can always beat PR.

Summary. Generally, for solving the viral marketing problem, CC and CI perform consistently well on each network; when the network is large-scale, CI is the more proper choice. If we want a faster algorithm, CF is the best.

6.4.2 The Impact of $\lambda$

We investigate the effect of the tuning parameter $\lambda$ on the running time of CC and CF and on their influence spread. Specifically, we let $\lambda$ range from 0.05 to 1 with a step of 0.05 and obtain the corresponding influence spread and running time. For a clear view of the influence spread results, we use the ratio of their influence spread to CELF's to indicate their effectiveness.

The upper row of Figure 6 shows the effectiveness of CC and CF with different $\lambda$ on the Polblogs, Wiki-Vote, and ca-HepPh networks, respectively. In these figures, the x axis is the $\lambda$ value; the red dashed line is $y = 1$, which indicates the results of CELF; the black dashed line indicates the optimal value. From these figures, we can make the following observations:

• The performances of CC and CF all first increase and then decrease, which follows the same trend as in Figure 2, but the optimal values are reached at $\lambda = 0.25, 0.15, 0.25$ respectively; they are all reached a little later than the peak values in Figure 2.

• The performance of CC is very stable. No matter what value $\lambda$ takes, the difference in effectiveness is less than 0.04, and for most $\lambda$ values the effectiveness of CC is larger than 0.98.

• The best $\lambda$ is located in the range $[0.1, 0.4]$.

The bottom row of Figure 6 shows the running time of CC and CF with different $\lambda$ on Polblogs, Wiki-Vote, and ca-HepPh, respectively. In these figures, we can observe that the running time of CC decreases as $\lambda$ increases, while the running time of CF stays at a constant value. From the above observations, we know that, for CC, if we want better effectiveness we should set $\lambda$ to a number in $[0.1, 0.4]$, and if we want to get the result efficiently we should set $\lambda$ to a comparatively large value; for CF, we can directly set $\lambda$ to a number in $[0.1, 0.4]$.

7 Conclusion 

In this paper, we developed a social influence model based on circuit theory for describing the information propagation in social networks. This model is tractable and flexible for understanding patterns of information propagation. Under this model, several upper bound properties were identified. These properties can help us to quickly locate the nodes to be considered during the information propagation process. This can drastically reduce the search space, and thus vastly improve the efficiency of measuring the influence strength between any pair of nodes. In addition, the circuit theory based model provides a new way to compute the independent influence of nodes and leads to a natural solution to the social influence maximization problem. Finally, experimental results showed the advantages of the circuit theory based model over the existing models in terms of efficiency as well as effectiveness for measuring the information propagation in social networks.

Fig. 5. The computational performances.